HALSCRIB (c) 1987 Hal Mueller Page 1 of 7 The paper consists of the following sections: 1. Overview and approach of other computerized classic games 2. Elements and Rules of Cribbage 3. Optimal Strategy selection and deviation therefrom employed during a game by the computer and the human opponent 4. Preliminary results against human opponents of varying skill levels 5. Summary and Conclusions 6. Further Research 1. Other Computerized Classic Games A number of 2-person zero-sum games can be broadly classified into those where i. all information is available to both players ii. some information is available to both players iii. the element of chance plays no role whatsoever iv. the element of chance plays some role in the outcome v. the margin of victory may double the winner's payoff Games such as CHESS, GO, and REVERSI/OTHELLO belong to categories i. and iii. POKER belongs to ii and iv. BACKGAMMON belongs to categories i, iv, and v. CRIBBAGE and DOMINOES belong to categories ii, iv, and v. Typical move selection heuristics for Chess-playing programs include mini-max search (including alpha-beta pruning), search for "killer" moves, and iterative deepening based on an evaluation function. The depth and width of the search materially affect the strength of the program. In Go, the size of the search tree precludes the approach taken in Chess. Pattern-matching heuristics in conjunction with massive parallel processing probably will result in GO-playing programs that approach that of a human expert. HALSCRIB (c) 1987 Hal Mueller Page 2 of 7 Work done in computer Poker indicates a weakness on the reliance on expectations and the use of measurements of the opponents' past performance for judging "Bluffing" situations. Programs written by Findler et al implemented different playing strategies, Mathematically Fair (static strategy - decides when and what to bet by equating expected values of wins and losses - bets strictly according to the odds), Statistically Fair (dynamic strategy - includes a learning component that identifies opponents' extent and frequency of bluffing), and several others (RH Player, Bayesian Player, Pattern Recognition Player, Advice Taker and Inquirer, Quasi-Optimizer Player, etc). Backgammon programs by H. Berliner, Carnegie-Mellon, utilize a smoothing approach to strategy variation as dictated by the relative status of the players' remaining men. He classified different backgammon positions such that the evaluation space provided for increased importance of particular features. He controlled these transitions (dependent on other features as well) by "application coefficients" which vary slowly and smoothly avoiding the rough boundaries which exist between the different classified positions. (This approach is called SNAC, for smoothness, nonlinearity, and application coefficients). Most Cribbage games encountered by the author follow a static strategy based on heuristics, regardless of the relative status of the players' scores. During the selection of the discards, and the play of the hands, several opportunities exist to vary strategical decisions. 2. Elements and Rules of Cribbage Cribbage is one of the most enjoyable card games ever developed for two players. Not only does it provide scope for informed strategic decisions, but also offers numerous ways of scoring, both in the play and in the count of the hands and crib. Initially, an ordinary deck of 52 playing cards is shuffled. Each player "cuts" the deck, exposing a card. By mutual agreement beforehand, the player cutting the lower (or higher) is designated as the "dealer" for the first deal. The deal alternates afterwards. The dealer shuffles the deck randomly and then deals six cards alternately, one at a time, face down to each player. The players examine their hands, and decide which two cards to discard face down in the crib. The crib belongs to the dealer. After the crib has been formed, the non-dealer cuts the cards. The dealer then turns face up the top card of the lower deck which is then placed on top of the deck. This top card is the up-card (or "starter"). HALSCRIB (c) 1987 Hal Mueller Page 3 of 7 The remaining four cards are retained and become revealed during the play. The non-dealer begins the play by laying out one of his cards face up and announcing its value. (Aces are 1, face- cards are all 10, remaining cards have a value equal to their pip-value). The dealer then lays out a card in the same fashion, but announces the cumulative sum - previous cumulative sum plus the value of his card. The play alternates until the cumulative sum is 31 or neither player can play a card without exceeding 31. The cumulative sum is reset to 0 and play commences as before until all four cards have been layed out by both players. During the play either player can score points in a number of ways. This process of obtaining points is called "pegging". After the hand has been played, the non-dealer computes the value of his hand in conjunction with the up-card which can be included in the evaluation of each player's hand as well as the crib. After the non-dealer has scored his hand, the dealer scores his hand and then his crib. The first player to reach 121 points wins the game. If the other player has less than 91 points but more than 60, the value of the game is doubled ("skunked"), or the loser has less than 61 points the value of the game is quadrupled ("double-skunk"). Points can be scored in a number of ways, some of which are similar to the way in which points during pegging may be scored. Scoring During Pegging: 15 or 31: If a player makes the cumulative sum = 15 or = 31, then that player pegs 2 points (increases his score by 2). Pairs: If a player plays a card of the same rank as the last one (in play), then that player pegs 2 points. If, after a pair has been made, a card of the same rank can be legally played, then that player pegs 6 points (3-of-a-kind, triplets, or Pairs Royal). If, after a 3-of-a-kind has been made, a card of the same rank can be legally played, then that player pegs 12 points (4-of-a- kind, double pairs, or double Pairs Royal). (Can only happen with cards whose rank is 7 or less.) HALSCRIB (c) 1987 Hal Mueller Page 4 of 7 Sequences or Runs: If 3 or more cards in uninterrupted numerical sequence, not necessarily of the same suit nor in strict ascending or descending sequence - eg 3-4-2-5 but not 3-4-6-2 have been played, the player who completed the run or sequence pegs 1 point for each card in the uninterrupted sequence. In the first example the player who played the 2 pegs 3 points (run of 3) and the player who played the 5 pegs 4 points (run of 4). Whereas in the second example there are no runs at all, but the player on turn could peg 5 points by playing a 5 (run of 5). Last-card or Go: If a player on turn is unable to play another card without exceeding 31, he says Go and the other player must go on playing, if he can, until he reaches 31 or can not play another card without exceeding 31 himself. If he is also unable to play, he also announces Go and pegs 1 point. The last card played scores 1 point unless it makes the cumulative sum = 31, in which case only 2 points are pegged. Scoring of Hands and Crib: After pegging has been completed, the non-dealer places his cards face up and counts his hand including the up-card and makes as many scoring combinations of 15's, pairs, runs, flushes, as possible. 15's: Every combination of cards (with or without the up-card) that totals 15 scores 2 points. Pairs: Every combination of cards that forms a pair (with or without the up-card) scores 2 points. Runs: Every combination of cards (with or without the up-card) that forms a run of 3 or more, regardless of suit, scores 1 point for each card in the run. Flushes: Four cards of the same suit in the hand or all cards of the same suit as the up-card scores 1 point for each card of the same suit. (Score is 4 or 5). However, the crib must have all cards of the same suit as the up- card to score 5. (Score of 4 is not possible.) HALSCRIB (c) 1987 Hal Mueller Page 5 of 7 His Nibs (His Nobs, Jack-in-the-Hand or Crib): One point is scored for having a Jack in the hand or in the crib of the same suit as the up-card. Up-card: If a Jack is the Up-card, the dealer pegs 2 points immediately after the cut has been made. Optimal Strategy Selection Hand Selection from 6 Dealt There are 15 different combinations of 4 cards (to be retained in the hand) and 2 cards (to be discarded into the crib). The program decides which of 7 different strategies it should adopt: max of 4-card scores (ignoring up-card and crib) max of max best 5-card score - min crib | dealer is opp max of max best 5-card score + max crib | dealer is comp max of expected 5-card score - expected crib | dealer is opp max of expected 5-card score + expected crib | dealer is comp max of expected 5-card score - max crib | dealer is opp max of expected 5-card score + expected crib | dealer is comp The strategy selected is dependent on the score and the deal. Whenever the difference in score between the two players is less than 16, the strategy it adopts is the "optimal" (expected hand and expected crib are utilized). This strategy is "optimal" in the sense that the player will maximize his points, on average, over a large number of plays of a particular holding. If it is behind by more than 15 points it adopts a "risky" strategy. (Chooses max of best 5-card hand + max crib or - min crib). This strategy is risky because it depends on obtaining the most favourable up-card. This up-card would not, in general, be the most likely. If it is ahead by more than 15 points it adopts a "safe" strategy. (Chooses max of expected hand + expected crib or - max crib). If the computer is the dealer, its strategy is similar to the optimal strategy (ties are broken differently). However, if the opponent is the dealer, then it will choose its discards so as to minimize their benefit to the opponent given the most undesirable up-card. Again, this unfavourable up-card would not, in general, be the most likely. If it can reach game, it adopts a "safe" strategy. If the opponent counts first and is expected to reach 121, then it adopts a "risky" strategy. HALSCRIB (c) 1987 Hal Mueller Page 6 of 7 If it counts first and the opponent is expected to reach 121, then it adopts a "safe" strategy if it has less than 91 and can exceed 90 based on its hand (to avoid a skunk) and similarly if it has less than 61 and can exceed 60 (to avoid a double-skunk). If it expects to reach game and the opponent counts first and has less than 91 (or 61) and expects the opponent to avoid the skunk (or double-skunk), then it adopts a "risky" strategy. During the play of the hand, the current point differential is used in selecting its strategy: safe, optimal, risky. For each possible legal card, it evaluates the expected pegging score (using current probabilities), the maximum possible net pegging score (ignoring probabilities), and the maximum possible opponent pegging score (ignoring probabilities). The optimal strategy selects the max expected pegging score. The safe strategy selects the min of opponent max pegging score. The risky strategy selects the max of net pegging score. If it is on lead and there is no pair in its holding, it will select that card which maximizes its expected value. If there is a pair, the expected value of always leading a card from a pair is greater than from any non-pair card playing against an unbiased opponent. Because a human opponent can easily detect this bias, it will forego the mathematically sound play of the pair and choose an "inferior" card between 50-80% of the time. (The percentage is dependent on the point differential). Similarly, if it is on play after the opening lead, it will select that card which maximizes its expected value. Because a bias in always pairing if possible also can be detected easily, it will forego the mathematically optimum play and select an "inferior" card between 35-65% of the time. These percentages were derived from the following "Pay-off Matrix" and then adjusted to reflect actual frequencies of applicability. Table I (simplistic) Table II (probabilistic) || 4 -2 || || .40 -.23 || || -2 2 || || -.25 .24 || Optimal strategy is to lead a pair 40% (44%) of the time - pair the first card 40% (42%) of the time. These ratios are adjusted to allow for those situations when no choice is possible (no pair available to lead - no card available to pair). Preliminary Results During the initial trials of the program against myself and knowledgable cribbage players, it became obvious that when on lead it invariably led from a small pair whenever it could. This selection is optimal if there is no bias in the play of its opponent or if the opponent also plays in "optimal" fashion. However, a human opponent can take advantage of this prediliction and avoid pairing cards immediately at some slight risk of not pegging anything or being paired by the computer later in the play of the hand. HALSCRIB (c) 1987 Hal Mueller Page 7 of 7 A change was made whereby it deviated from the (biased) optimum, so that when it had a pair, it made that selection 65% of the time in "optimal" situations, 80% of the time in "safe" situations, and 50% of the time in "risky" situations. (In "safe" situations, a pair by the risk-taking opponent would allow it to increase its lead.) (In "risky" situations, a cautious opponent would avoid pairing so as to avoid having the computer peg 6 for 3-of-a-kind. It would increase the possibility of it pairing its own card later in the play.) This made the computer far less predictable and consequently increased its playing strength. Cumulative tally: Before (biased optimum) - 20 wins by computer 31 wins by opponents After (unbiased optimum) - 72 wins by computer 53 wins by opponents Summary and Conclusions Cribbage is a game where a knowledge of permutations and combinations is essential to optimal play. A knowledge of the predictable playing style of the opponent can materially affect the outcome of an extended series of games. Further Research At present, the program adopts a "static" statistically optimum mixed strategy when leading and pairing (see Pay-off Matrix) against all opponents. The next step is to monitor the pegging-style of each opponent and adjust its own pegging-style ratios accordingly. In addition, it should use current probabilities in computing the value of the pay-off matrix entries. The result would be a "dynamic" sta- tistical optimal mixed strategy for the initial lead or initial response to the opening lead. How this program would fare against an opponent who uses "pure" strategies only or against a "static" statistical mixed strategy in a match of 15 games would be interesting. BIBLIOGRAPHY and REFERENCES Anderson, Douglas, All About Cribbage, Winchester Press, NYC, 1971 Berliner, Hans, Computer Backgammon, Scientific American June 1980 Findler, Nicholas V., Computer Poker, Scientific American July 1978 Levy, David, Computer Gamesmanship, Century Publishing, London, 1983